AP&T Alimentary Pharmacology and Therapeutics 



Nine scoring models for short-term mortality in alcoholic 
hepatitis: cross-validation in a biopsy-proven cohort 

V. Papastergiou*, E. A. Tsochatzis*, G. Fieri*, E. Thalassinos*, A. Dinar*, S. Bruno†, S. Karatapanis*, T. V. Luong†,
J. O'Beirne*, D. Patch*, D. Thorburn* & A. K. Burroughs*



*The Royal Free Sheila Sherlock Liver
Centre and UCL Institute of Liver and
Digestive Health, Royal Free Hospital,
London, UK.

†Department of Cellular Pathology,
UCL Medical School, Royal Free
Campus, London, UK.



Correspondence to: 

Prof. A. K. Burroughs, The Royal Free 
Sheila Sherlock Liver Centre, Royal 
Free Hospital, NW3 2QG, London, 
UK. 

E-mail: andrew.burroughs@nhs.net 



Publication data 

Submitted 22 December 2013 
First decision 8 January 2014 
Resubmitted 18 January 2014 
Accepted 20 January 2014 
EV Pub Online 12 February 2014 

This article was accepted for publication 
after full peer-review. 



SUMMARY 
Background 

Several prognostic models have emerged in alcoholic hepatitis (AH), but 
lack of external validation precludes their universal use. 

Aim 

To validate the Maddrey Discriminant Function (DF); Glasgow Alcoholic 
Hepatitis Score (GAHS); Mayo End-stage Liver Disease (MELD); Age, Biliru- 
bin, INR, Creatinine (ABIC); MELD-Na, UK End-stage Liver Disease 
(UKELD), and three scores of corticosteroid response at 1 week: an Early 
Change in Bilirubin Levels (ECBL), a 25% fall in bilirubin, and the Lille score. 

Methods 

Seventy-one consecutive patients with biopsy-proven AH, admitted between 
November 2007-September 2011, were evaluated. The clinical and biochem- 
ical parameters were analysed to assess prognostic models with respect to 
30- and 90-day mortality. 

Results 

There were no significant differences in the areas under the receiver
operating characteristics curve (AUROCs) relative to 30-day/90-day mortality:
MELD 0.79/0.84, DF 0.71/0.74, GAHS 0.75/0.78, ABIC 0.71/0.78, MELD-Na
0.68/0.76, UKELD 0.56/0.68. One-week rescoring yielded a trend towards
improved predictive accuracies (30-day/90-day AUROCs: 0.69-0.84/0.77-0.86).
In patients with admission DF >32 (n = 31), response to corticosteroids
according to ECBL, 25% fall in bilirubin and the Lille model yielded
AUROCs of 0.73/0.73, 0.78/0.72 and 0.81/0.82 for a 30-day/90-day outcome
respectively. All models showed excellent negative predictive values (NPVs;
range: 86-100%), while the positive ones were low (range: 17-50%).

Conclusions 

MELD, DF, GAHS, ABIC and scores of corticosteroid response proved to 
be valid in an independent cohort of biopsy-proven alcoholic hepatitis. 
MELD modifications incorporating sodium did not confer any prognostic 
advantage over classical MELD. Based on excellent NPVs, the models are 
best to identify patients at low risk of death. 

INTRODUCTION

Alcoholic hepatitis (AH) is an acute inflammatory hepatic syndrome
occurring in patients with alcohol misuse. However, the clinical phenotype
of AH is very variable. There are mild forms, likely to improve with
conservative management, while severe cases have a high risk of death even
if treated. Currently, corticosteroids, pentoxifylline and N-acetylcysteine
are the therapeutic options, although treatment of AH remains
controversial. A survival benefit conferred by steroids is indeed disputed
in standard meta-analysis, but supported in individual patient data
analysis. An ongoing, adequately powered, UK randomised controlled trial
will probably answer such therapeutic controversies. Due to the potential
adverse events associated with corticosteroids (mainly occurrence of
sepsis), AH is currently managed on a risk-benefit basis. Thus, prognostic
stratification according to short-term mortality is paramount both for
disease management and to enable clinical trials targeting new treatments
in AH.

For over 3 decades, the Maddrey discriminant function (DF) has been the
standard surrogate for the assessment of disease severity and to guide
treatment in AH. A cut-off value of >32 identified patients who had
greater than 50% mortality at 30 days and therefore this was instituted as
the threshold for corticosteroid therapy. Over the years, alternative
prognostic scores have been developed: the Glasgow alcoholic hepatitis
score (GAHS) and the age, bilirubin, international normalised ratio,
creatinine score (ABIC). Alongside these
disease-specific formulas, previous studies (including 
3^ 15 -73I6 202" patients) have outlined the utility of 
the model for end-stage liver disease (MELD) for predicting mortality in
AH, whereas the utility of the MELD including sodium (MELD-Na) has also
been assessed in a small study. A further refinement has been to assess
the response to corticosteroid treatment. In this context, the Lille score
with a threshold of 0.45 has been developed to identify patients with
severe AH who might benefit from corticosteroids, whereas use of GAHS with
a threshold of 9 has also been proposed. Previously, any fall in serum
bilirubin levels after 1 week of corticosteroid therapy (Early Change in
Bilirubin Levels: ECBL), or more specifically a 25% fall, has been
proposed as a simple indicator of corticosteroid response.

In recent years, several prognostic models have become available in AH,
all of them advocated as best by their respective authors. External
validation and model comparisons are therefore required to guide selection
among the models for use in routine clinical practice. However, diagnosis
of AH is challenging and patients with other forms of hepatic
decompensation (such as decompensated cirrhosis with severe jaundice and
acute alcoholic steatosis) may be erroneously classified as AH. This is
more likely to happen when liver biopsy is not performed, including a
transjugular approach which can obviate clotting problems. Indeed,
diagnosis of AH based on clinical grounds has been associated with a
10-50% risk of misclassification. Thus, in the present study, we aimed to
cross-validate nine prognostic indices for short-term mortality using an
independent cohort of patients with AH confirmed by transjugular liver
biopsy. This is standard practice in our centre whenever AH is suspected.

PATIENTS AND METHODS 

Study population 

Consecutive patients with a histological diagnosis of AH by liver biopsy,
between November 2007 and September 2011, were identified through a
computerised pathology register. All patients were referred for
transjugular liver biopsy by their treating physician (who was a
hepatologist in all cases) due to the clinical suspicion of AH. The
patients' clinical and biochemical features on admission were compatible
with a diagnosis of AH, according to the following criteria: a) history of
alcohol abuse within the last 2 months (>40 g/day for men; >20 g/day for
women), b) total serum bilirubin exceeding 2x the upper limit of normality
(ULN = 17 µmol/L), c) aspartate to alanine aminotransferase ratio
exceeding 1.5 with aspartate aminotransferase over 45 U/L and d) absence
of a concomitant primary cause of liver disease. Patients with
pre-existing viral hepatitis (n = 9) were not excluded because the
clinical basis of their hospital admission was AH. Demographical and
laboratory data were extracted by reviewing the electronic medical charts.
Survival at 30 and 90 days following hospital admission was established by
chart review or phone contact, if necessary. Therapy for AH was also
assessed. According to local protocol, patients with severe AH (DF >32)
were given a single daily dose of oral prednisolone 40 mg for 28 days, in
addition to supportive therapy including gastric acid suppressors, high
dose vitamin B and C, vitamin K, dietary supplements often by enteral
feeding, and chlordiazepoxide if there were alcohol withdrawal symptoms.
In those unable to take oral medication, 32 mg/day of methylprednisolone
was administered intravenously. Which patients were placed on steroids,
the presence of contraindications to steroid treatment and the exact date
of initiation of steroid therapy were all recorded by retrospectively
reviewing medical charts.

Derivation of prognostic models 

For each patient, laboratory values obtained on the day of hospital
admission were used to calculate prognostic models according to their
formulas (Table 1). MELD, MELD-Na, UKELD, GAHS, ABIC and DF were all
re-calculated using laboratory data from day 7 after admission to
establish whether 1-week rescoring could be associated with an improved
predictive performance, as outlined previously. The Lille score is a
combination of six reproducible variables incorporating a dynamic one
(i.e. the 1-week evolution in bilirubin). This model, as well as ECBL and
a 25% fall in bilirubin levels, does not have the same prognostic basis
and so cannot be compared to other models, as they were specifically
developed for the assessment of corticosteroid response. Thus, these three
scores were validated separately in a subgroup of patients with severe AH
(admission DF >32) treated with corticosteroids, using clinical and
biochemical parameters obtained on the day before treatment start and the
evolution in bilirubin at day 7 of treatment with steroids.

Table 1 | Formulas and included variables in prognostic models for alcoholic hepatitis

| Model | Variables included | Formula |
|---|---|---|
| MELD | Bilirubin, creatinine, INR | 9.57 x loge (creatinine, mg/dL) + 3.78 x loge (bilirubin, mg/dL) + 11.20 x loge (INR) + 6.43 |
| DF | PT, bilirubin | 4.6 x (patient's PT - control PT) + bilirubin (mg/dL) |
| GAHS* | Age, leucocytes, urea, PT ratio, bilirubin | Age (<50 years = 1, ≥50 years = 2) + leucocytes (10^9/L) (<15 = 1, ≥15 = 2) + urea (mmol/L) (<5 = 1, ≥5 = 2) + PT ratio (<1.5 = 1, 1.5-2.0 = 2, >2.0 = 3) + bilirubin (µmol/L) (<125 = 1, 125-250 = 2, >250 = 3) |
| ABIC | Age, bilirubin, creatinine, INR | (Age in years x 0.1) + (bilirubin mg/dL x 0.08) + (creatinine mg/dL x 0.3) + (INR x 0.8) |
| MELD-Na | Bilirubin, creatinine, INR, Na | MELD - Na - [0.025 x MELD x (140 - Na)] + 140 (where the serum sodium concentration is bound between 125 and 140 mmol per litre) |
| UKELD | Bilirubin, creatinine, INR, Na | 5 x [1.5 x loge (INR) + 0.3 x loge (creatinine µmol/L) + 0.6 x loge (bilirubin µmol/L) - 13 x loge (Na) + 70] |
| Lille | Age, albumin, bilirubin, Δ bilirubin, renal insufficiency, PT | R = 3.19 - 0.101 x (age in years) + 0.147 x (albumin day 0 in g/L) + 0.0165 x (bilirubin day 0 - bilirubin day 7, µmol/L) - 0.206 x (renal insufficiency) - 0.0065 x (bilirubin day 0, µmol/L) - 0.0096 x (PT in seconds); Lille model = exp (-R)/(1 + exp (-R)) |

PT, prothrombin time; INR, international normalised ratio; MELD, model for end-stage liver disease; DF, Maddrey's discriminant
function; GAHS, Glasgow alcoholic hepatitis score; ABIC, age, bilirubin, international normalised ratio and creatinine score;
MELD-Na, modified MELD including sodium; UKELD, United Kingdom model for end-stage liver disease.

* Rather than a formula, GAHS is based on a scoring system.
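
The Table 1 formulas can be written out as a short Python sketch. This is illustrative only and is not the authors' code: the function and parameter names are hypothetical, each input is expected in the unit named by its parameter (the cohort reports bilirubin and creatinine in µmol/L, so the mg/dL inputs would first need converting), no lower or upper bounds are applied other than the sodium bounds stated in Table 1, and the Lille renal insufficiency term is assumed to be a 0/1 flag supplied by the caller, since its definition is not given in the table.

```python
import math

def meld(creatinine_mg_dl, bilirubin_mg_dl, inr):
    # MELD as written in Table 1 (no bounding of inputs applied here).
    return (9.57 * math.log(creatinine_mg_dl)
            + 3.78 * math.log(bilirubin_mg_dl)
            + 11.20 * math.log(inr) + 6.43)

def maddrey_df(pt_s, control_pt_s, bilirubin_mg_dl):
    # Maddrey discriminant function.
    return 4.6 * (pt_s - control_pt_s) + bilirubin_mg_dl

def gahs(age, leucocytes_e9_l, urea_mmol_l, pt_ratio, bilirubin_umol_l):
    # Glasgow score: a points system, total ranges from 5 to 12.
    score = 1 if age < 50 else 2
    score += 1 if leucocytes_e9_l < 15 else 2
    score += 1 if urea_mmol_l < 5 else 2
    score += 1 if pt_ratio < 1.5 else (2 if pt_ratio <= 2.0 else 3)
    score += 1 if bilirubin_umol_l < 125 else (2 if bilirubin_umol_l <= 250 else 3)
    return score

def abic(age, bilirubin_mg_dl, creatinine_mg_dl, inr):
    return (age * 0.1 + bilirubin_mg_dl * 0.08
            + creatinine_mg_dl * 0.3 + inr * 0.8)

def meld_na(meld_score, na_mmol_l):
    # Sodium is bounded between 125 and 140 mmol/L, as stated in Table 1.
    na = min(max(na_mmol_l, 125.0), 140.0)
    return meld_score - na - 0.025 * meld_score * (140.0 - na) + 140.0

def ukeld(inr, creatinine_umol_l, bilirubin_umol_l, na_mmol_l):
    return 5.0 * (1.5 * math.log(inr)
                  + 0.3 * math.log(creatinine_umol_l)
                  + 0.6 * math.log(bilirubin_umol_l)
                  - 13.0 * math.log(na_mmol_l) + 70.0)

def lille(age, albumin_g_l, bili_day0_umol_l, bili_day7_umol_l,
          renal_insufficiency, pt_s):
    # renal_insufficiency: assumed 0/1 flag (definition not given in Table 1).
    r = (3.19 - 0.101 * age + 0.147 * albumin_g_l
         + 0.0165 * (bili_day0_umol_l - bili_day7_umol_l)
         - 0.206 * renal_insufficiency
         - 0.0065 * bili_day0_umol_l
         - 0.0096 * pt_s)
    return math.exp(-r) / (1.0 + math.exp(-r))
```

Keeping one function per model makes day-7 rescoring a matter of calling the same function with day-7 laboratory values.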

Statistical analyses 

Baseline characteristics of the study population were compared by using
the Chi-squared test for categorical data and Student t-test or
Mann-Whitney U test for continuous data, as appropriate. Occurrence of
death due to any cause within 30 or 90 days from the hospital admission
was the study endpoint. Mortality rates were calculated as the proportion
of patients that died within these time intervals. In patients with severe
AH (admission DF >32), a Cox proportional hazards model was evaluated to
assess the crude and adjusted effect of corticosteroid therapy with
respect to either 30- or 90-day mortality. The utility of each model to
predict 30- or 90-day mortality was evaluated using receiver operating
characteristics (ROC) curves, and the area under the receiver operating
characteristics curves (AUROCs) was calculated. In this analysis, a model
with an AUROC between 0.7 and 0.8 was considered clinically useful and
between 0.8 and 0.9 as having very good diagnostic accuracy. If the AUROC
approaches 1.0, the model approaches 100% sensitivity and specificity,
indicating a perfect diagnostic test. Sensitivity, specificity, positive
predictive value (PPV) and negative predictive value (NPV) of the models
were calculated using originally published cut-offs: 32 for DF, 9 for
GAHS, 21 for MELD, 28 for MELD-Na, 6.71 and 9 for the ABIC and 0.45 for
the Lille score. As no disease-specific calibration has been reported for
UKELD, we calculated optimised predictive performances using the best
cut-off within our cohort (point nearest to the top left corner of the ROC
curve, yielding the best relationship between sensitivity and
specificity). Comparison between AUROCs was performed by the method of
Hanley and McNeil and the P-values obtained were considered indicative of
nonsimilarity if below 0.05. All analyses were performed using SPSS
version 22 (SPSS, IBM, Chicago, IL, USA) except for the comparisons
between AUROCs, which were performed using MedCalc version 12.2.0
(Medisoftware, Mariakerke, Belgium).
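
The AUROC and the "point nearest to the top left corner" rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the SPSS or MedCalc procedure used in the study: the function names are hypothetical, the AUROC is the Mann-Whitney estimate with ties counted as one half, the standard error uses the Hanley and McNeil (1982) approximation (an assumption; the software may use a different estimator), and a score at or above the cut-off is taken to predict death.

```python
import math

def auroc(scores_died, scores_survived):
    # Probability that a patient who died scored higher than a survivor
    # (Mann-Whitney estimate; ties count as one half).
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))

def hanley_mcneil_se(a, n_died, n_survived):
    # Hanley-McNeil (1982) approximation of the standard error of an AUROC.
    q1 = a / (2.0 - a)
    q2 = 2.0 * a * a / (1.0 + a)
    var = (a * (1.0 - a)
           + (n_died - 1) * (q1 - a * a)
           + (n_survived - 1) * (q2 - a * a)) / (n_died * n_survived)
    return math.sqrt(var)

def closest_to_top_left(scores_died, scores_survived):
    # Cut-off whose (1 - specificity, sensitivity) point lies nearest (0, 1).
    best = None
    for cut in sorted(set(scores_died) | set(scores_survived)):
        sens = sum(x >= cut for x in scores_died) / len(scores_died)
        spec = sum(x < cut for x in scores_survived) / len(scores_survived)
        dist = math.hypot(1.0 - sens, 1.0 - spec)
        if best is None or dist < best[0]:
            best = (dist, cut, sens, spec)
    return best[1], best[2], best[3]
```

With toy data such as `closest_to_top_left([5, 6, 7], [1, 2, 3, 6])`, the rule selects the cut-off that keeps every death above the threshold while misclassifying only one survivor.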



RESULTS 

Study cohort, biochemical data and scores of the 
different predictive models 

Seventy-one consecutive patients with a biopsy-proven diagnosis of AH who
met the inclusion criteria comprised the study population. The baseline
clinical data and prognostic score values are shown in Table 2. There were
47 males and 24 females with a median age of 49 years. Median admission
MELD and DF were 18.8 and 47.5 respectively. The median interval between
admission and the date of liver biopsy was 1.5 days (range: 0-6 days).
This time interval was comparable between survivors and patients who died
within either 30 days (P = 0.28) or 90 days (P = 0.76) from hospital
admission. Overall, the 30-day mortality was 14.1% (10/71), whereas the
90-day mortality was 19.7% (14/71). The differences between survivors and
nonsurvivors at 30 and 90 days from admission are shown in Table 2. With
respect to 30-day mortality, patients who died had a higher admission
median bilirubin, urea, creatinine, prothrombin time and INR, higher
prognostic score values, lower albumin and were more frequently female as
compared to patients who survived (Table 2a). Similar differences were
detected with respect to 90-day mortality (except there were no
significant gender differences and there was a trend for higher admission
leucocyte count in those who died), and when comparisons were repeated by
taking into account the 1-week biochemical values and scores (Table 2b).
Considering the subgroup of patients with severe AH (admission DF >32;
n = 49), patients who died at 30 days (n = 10) had a lower median albumin
(26 vs. 30 g/L; P = 0.04) and were marginally more frequently of female
gender (6/10 vs. 11/39; P = 0.07). With respect to 90-day mortality,
patients with severe AH who died (n = 14) had a lower admission median
albumin (27 vs. 30 g/L; P = 0.05) and a higher median creatinine (74.5 vs.
53 µmol/L; P = 0.05) as compared to those who survived, whereas no other
differences were detected between the two groups considering all
30-/90-day variables included in Table 2 (data not shown).

Data on corticosteroid treatment 

Overall, 49 (69%) patients had a DF >32 at presentation and 34 (69.4%)
were treated with corticosteroids, whereas no patient received
pentoxifylline or other specific treatment for AH. Contraindications for
corticosteroid treatment included variceal bleeding in 3 patients and
infection in 5 patients, including 2 patients with spontaneous bacterial
peritonitis (diagnosed by a neutrophil count >250 cells/mm3 in the ascitic
fluid), whereas another two patients refused corticosteroid treatment. In
five patients, the decision not to treat with corticosteroids was based on
the clinical judgment of the treating physician, despite no obvious
treatment contraindications. Mean time between admission to Royal Free
Hospital and start of corticosteroid treatment was 2.44 ± 1.88 days
(range: 0-8 days). That time interval was comparable between patients who
survived at 30 (2.64 ± 1.98) and 90 days (2.61 ± 1.88) and those who did
not (1.5 ± 0.84; P = 0.19 and 1.87 ± 1.88; P = 0.21, respectively).
Considering 49 patients with an admission DF >32 (i.e. those expected to
benefit from corticosteroids), the 30-day survival rate was 28/34 (82.4%)
in steroid-treated patients vs. 11/15 (73.3%) in nontreated (P = 0.47). At
90 days, the survival rate was 26/34 (76.5%) in patients receiving
corticosteroids vs. 9/15 (60%) in nontreated (P = 0.31). In patients with
severe AH (admission DF >32), the crude hazard ratio (HR) for
corticosteroid treatment was 0.57 (95% confidence interval (CI):
0.16-2.05; P = 0.40) with respect to 30-day mortality and 0.55 (95% CI:
0.19-1.58; P = 0.27) with respect to 90-day mortality. Lack of a
significant corticosteroid effect on mortality persisted after adjustment
for admission variables differing (P < 0.1) between survivors and
nonsurvivors (i.e. gender and bilirubin with respect to 30-day mortality;
creatinine and albumin for 90-day mortality), yielding for 30-day
mortality: HR = 0.69 (95% CI: 0.19-2.50; P = 0.57) and for 90-day
mortality: HR = 0.56 (95% CI: 0.18-1.71; P = 0.31).

Table 2 | Comparison of admission and 1 week variables and scores between patients who survived at 30- and 90-days and those who died. All quantitative variables are given as medians (range)

a, Admission

| Variable | Total cohort (n = 71) | 30-day survivors (n = 61) | 30-day nonsurvivors (n = 10) | P-value | 90-day survivors (n = 57) | 90-day nonsurvivors (n = 14) | P-value |
|---|---|---|---|---|---|---|---|
| Age (years) | 49 (26-75) | 49 (26-75) | 50 (30-57) | 0.69 | 49 (26-72) | 50 (30-75) | 0.79 |
| Male gender n (%) | 47 (66.2) | 43 (70.5) | 4 (40) | 0.08 | 39 (68.4) | 8 (57.1) | 0.53 |
| Bilirubin (µmol/L) | 212 (44-827) | 187 (44-827) | 241 (144-711) | 0.09 | 178 (44-827) | 356 (144-711) | 0.006 |
| Albumin (g/L) | 30 (19-42) | 32 (19-42) | 26 (20-34) | 0.006 | 32 (19-42) | 27 (20-34) | 0.005 |
| Creatinine (µmol/L) | 57 (31-292) | 54 (31-240) | 74.5 (31-292) | 0.02 | 53 (31-240) | 74.5 (31-292) | 0.03 |
| Urea (mmol/L) | 3.5 (0.8-17.3) | 3.4 (0.8-17.3) | 6.7 (2.8-15) | 0.04 | 3.4 (0.8-12.8) | 5.4 (2.3-17.3) | 0.08 |
| Sodium (mmol/L) | 133 (121-155) | 136 (131-155) | 133 (121-144) | 0.09 | 134 (121-155) | 133 (121-144) | 0.72 |
| Prothrombin time (s) | 22.2 (12.2-45.4) | 21.8 (12.2-45.4) | 26.1 (16.1-39.6) | 0.03 | 21.7 (12.2-45.4) | 24.5 (16.1-39.6) | 0.02 |
| INR | 1.8 (0.9-4) | 1.7 (0.9-4) | 2.2 (1.3-3.2) | 0.02 | 1.7 (0.9-4) | 2.2 (1.3-3.2) | 0.009 |
| Leucocytes (10^9/L) | 11.2 (2.8-34.7) | 11.2 (2.8-25.2) | 11.8 (9.2-34.7) | 0.21 | 11.2 (2.8-25.2) | 11.8 (9.2-34.7) | 0.08 |
| DF | 47.5 (2.2-157.7) | 42.3 (2.2-157.7) | 57.4 (34-150.1) | 0.04 | 40.1 (2.2-157.7) | 58.3 (34-150.1) | 0.007 |
| GAHS | 8 (5-12) | 8 (5-12) | 10 (7-12) | 0.009 | 8 (5-12) | 10 (7-12) | 0.001 |
| ABIC score | 7.8 (4.4-12.1) | 7.6 (4.4-11.6) | 9.4 (6.9-12.1) | 0.03 | 7.5 (4.4-11.6) | 9.5 (6.9-12.1) | 0.001 |
| MELD score | 18.8 (7.9-40.3) | 18.7 (7.9-36.9) | 25.6 (15.4-40.3) | 0.004 | 18.2 (7.9-36.9) | 25.4 (15.4-40.3) | 0.0001 |
| MELD-Na score | 21.2 (6.3-40.2) | 20.3 (6.3-38.2) | 23.6 (15.4-40.2) | 0.08 | 20.2 (6.3-38.2) | 26.4 (15.4-40.2) | 0.003 |
| UKELD score | 57.7 (48.7-72.8) | 57.7 (48.7-72.8) | 57.5 (52.7-69.2) | 0.52 | 56.8 (48.7-72.8) | 59.6 (52.7-69.2) | 0.04 |

b, Day 7 from admission

| Variable | Total cohort (n = 63) | 30-day survivors (n = 56) | 30-day nonsurvivors (n = 7) | P-value | 90-day survivors (n = 52) | 90-day nonsurvivors (n = 11) | P-value |
|---|---|---|---|---|---|---|---|
| Bilirubin (µmol/L) | 146 (25-647) | 128 (25-647) | 214 (165-568) | 0.01 | 120.5 (25-647) | 420 (165-568) | 0.0001 |
| Creatinine (µmol/L) | 55 (26-343) | 54 (26-101) | 154 (36-343) | 0.02 | 54 (26-101) | 95 (36-343) | 0.007 |
| Urea (mmol/L) | 4.1 (1.6-22.3) | 4.1 (1.6-20.8) | 9 (2.8-22.3) | 0.04 | 4.1 (1.6-20.8) | 8.9 (2.8-22.3) | 0.07 |
| Sodium (mmol/L) | 136 (111-161) | 137 (125-161) | 136 (111-149) | 0.51 | 136.5 (111-149) | 133 (125-161) | 0.25 |
| Prothrombin time (s) | 20 (1.5-55.5) | 19.8 (16-35.1) | 23.9 (17.1-55.5) | 0.009 | 19.8 (16-35.1) | 23.2 (17.1-55.5) | 0.02 |
| INR | 1.6 (0.9-16) | 1.6 (0.9-3.1) | 2.1 (1.4-4.8) | 0.004 | 1.6 (0.9-3.1) | 1.9 (1.4-4.8) | 0.01 |
| Leucocytes (10^9/L) | 11 (3.6-34.1) | 10.9 (3.6-34.1) | 11.8 (5-31.1) | 0.66 | 10.9 (3.6-34.1) | 11.9 (5-31.1) | 0.46 |
| DF | 28.2 (1.46-194.2) | 27.6 (1.5-125.7) | 50.9 (34.4-194.2) | 0.002 | 27.2 (1.5-125.7) | 48.6 (34.4-194.2) | 0.001 |
| GAHS | 8 (5-12) | 7 (5-12) | 9.5 (8-11) | 0.01 | 7 (5-12) | 9 (8-11) | 0.003 |
| ABIC score | 7.2 (4.1-17.4) | 7.1 (4.1-17.4) | 8.3 (6.9-11) | 0.04 | 7 (4.1-17.4) | 7.9 (6.9-11) | 0.02 |
| MELD score | 15.7 (1.38-46.5) | 15.1 (1.4-34.1) | 28.7 (13.9-46.5) | 0.002 | 14.6 (1.4-34.1) | 25.6 (13.9-46.5) | 0.0001 |
| MELD-Na score | 18.5 (-7.3 to 44.1) | 18.4 (-7.3 to 34.4) | 27.8 (14.8-44.1) | 0.01 | 17.9 (-7.3 to 34.4) | 25.6 (14.8-44.1) | 0.001 |
| UKELD score | 55.5 (43.2-72.8) | 55.4 (43.2-68.3) | 60.6 (48.4-72.8) | 0.08 | 55 (43.2-68.3) | 61.7 (48.4-72.8) | 0.006 |

INR, international normalised ratio; DF, Maddrey discriminant function; GAHS, Glasgow alcoholic hepatitis score; ABIC, age,
bilirubin, INR, creatinine score; MELD, model for end-stage liver disease; MELD-Na, modified MELD including sodium; UKELD,
United Kingdom model for end-stage liver disease.

Use of the MELD, DF, GAHS, ABIC, MELD-Na and 
UKELD for the assessment of 30- and 90-day 
mortality 

The ROC curves of the models with respect to 30-day and 90-day mortality
are shown in Figure 1. The AUROCs for the prediction of 30-day mortality
ranged from 0.56 for UKELD to 0.79 for MELD and for the prediction of
90-day mortality between 0.68 (UKELD) and 0.84 for MELD (Table 3a). No
significant differences were found in pairwise comparisons between the
AUROCs of the different models (data not shown). Re-calculation of the
scores at day 7 from admission was possible for 63 patients; three
patients died before day seven, and five patients (all of whom survived)
did not have all the required biochemical data available at this time
point. Re-scoring on day 7 generally yielded a trend towards increased
AUROCs, ranging from 0.69 to 0.85 for 30-day and from 0.75 to 0.86 for
90-day mortality (Table 3b). However, none of the differences reached
statistical significance, and there were no statistically significant
differences in the pairwise comparisons between models (data not shown).
Figure S1 shows scatter plots of admission score values related to 30- and
90-day mortality, including representation of both originally published
and optimal cut-off points within our cohort. The latter were 44 for DF,
28 for MELD-Na and 56 for UKELD. A high ABIC cut-off of 9.5 resulted in
increased specificity as compared to the originally suggested value of 9
(90% vs. 80% and 95% vs. 84% for a 30- and 90-day outcome respectively).
Previously suggested MELD (21), GAHS (9) and low ABIC (6.7) cut-off points
performed optimally within our cohort. Using originally proposed
cut-points, the negative predictive values (NPV) for ruling out short-term
mortality were high (mostly exceeding 90%), whereas the ability of the
models to correctly predict occurrence of death (positive predictive
value; PPV) was substantially lower, in most cases less than 50% (range:
17-57%) (Table 3a). These properties remained largely unchanged when the
predictive performances of the models were re-assessed 1 week from
admission (NPV: 0.85-1.00, PPV: 0.20-0.57; Table 3b).
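
The gap between high NPVs and low PPVs follows directly from the low event rate. As a minimal sketch (the function name is hypothetical and this is not the authors' code), predictive values can be recomputed from a score's sensitivity and specificity together with the number of deaths and survivors in the cohort:

```python
def predictive_values(sensitivity, specificity, n_died, n_survived):
    # Expected cell counts of the 2x2 table implied by the inputs.
    true_pos = sensitivity * n_died
    false_neg = n_died - true_pos
    true_neg = specificity * n_survived
    false_pos = n_survived - true_neg
    positives = true_pos + false_pos
    negatives = true_neg + false_neg
    ppv = true_pos / positives if positives else float("nan")
    npv = true_neg / negatives if negatives else float("nan")
    return ppv, npv

# Admission MELD (cut-off 21) at 30 days: sensitivity 0.80, specificity 0.79,
# 10 deaths and 61 survivors among 71 patients.
ppv, npv = predictive_values(0.80, 0.79, 10, 61)
print(round(ppv, 2), round(npv, 2))  # 0.38 0.96
```

With only 10 deaths among 71 patients, even a specificity of 0.79 leaves roughly 13 false positives against 8 true positives, which is why the PPV stays below 50% while the NPV is close to 1.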




Figure 1 | Receiver operating characteristic curve of the different prognostic scores for alcoholic hepatitis calculated
on admission, used to predict 30-day (a) and 90-day (b) mortality. (x-axis: 1 - Specificity; diagonal segments are produced by ties.)

Table 3 | The AUROC and optimal operational characteristics in predicting 30- and 90-day mortality for the different prognostic scores calculated on the day of admission (a) and re-calculated after 7 days (b)

a, Admission (n = 71)

| Score | AUROC | Std. err | 95% CI | Sensitivity* | Specificity* | PPV* | NPV* |
|---|---|---|---|---|---|---|---|
| 30-day mortality | | | | | | | |
| MELD | 0.79 | 0.085 | 0.62-0.95 | 0.80 | 0.79 | 0.38 | 0.96 |
| DF | 0.71 | 0.092 | 0.53-0.89 | 1.00 | 0.36 | 0.20 | 1.00 |
| GAHS | 0.75 | 0.073 | 0.61-0.89 | 0.90 | 0.59 | 0.27 | 0.94 |
| ABIC | 0.71 | 0.079 | 0.55-0.86 | 1.00/0.60 | 0.20/0.80 | 0.17/0.33 | 1.00/0.92 |
| MELD-Na | 0.68 | 0.087 | 0.50-0.84 | 0.30 | 0.82 | 0.22 | 0.88 |
| UKELD | 0.56 | 0.087 | 0.39-0.73 | 0.90 | 0.43 | 0.21 | 0.96 |
| 90-day mortality | | | | | | | |
| MELD | 0.84 | 0.064 | 0.71-0.96 | 0.86 | 0.84 | 0.57 | 0.96 |
| DF | 0.74 | 0.062 | 0.61-0.86 | 1.00 | 0.39 | 0.29 | 1.00 |
| GAHS | 0.78 | 0.060 | 0.67-0.90 | 0.93 | 0.63 | 0.38 | 0.97 |
| ABIC | 0.78 | 0.072 | 0.64-0.92 | 1.00/0.64 | 0.21/0.84 | 0.24/0.50 | 1.00/0.91 |
| MELD-Na | 0.76 | 0.069 | 0.62-0.89 | 0.43 | 0.86 | 0.43 | 0.86 |
| UKELD | 0.68 | 0.076 | 0.53-0.83 | 0.93 | 0.46 | 0.30 | 0.96 |

b, Day 7 from admission (n = 63)

| Score | AUROC | Std. err | 95% CI | Sensitivity* | Specificity* | PPV* | NPV* |
|---|---|---|---|---|---|---|---|
| 30-day mortality | | | | | | | |
| MELD | 0.84 | 0.084 | 0.68-1.00 | 0.60 | 0.87 | 0.43 | 0.93 |
| DF | 0.85 | 0.058 | 0.74-0.96 | 1.00 | 0.48 | 0.24 | 1.00 |
| GAHS | 0.77 | 0.070 | 0.63-0.91 | 0.60 | 0.70 | 0.25 | 0.91 |
| ABIC | 0.74 | 0.092 | 0.54-0.90 | 0.80/0.30 | 0.46/0.93 | 0.20/0.43 | 0.94/0.89 |
| MELD-Na | 0.78 | 0.090 | 0.61-0.96 | 0.40 | 0.93 | 0.40 | 0.92 |
| UKELD | 0.69 | 0.106 | 0.48-0.89 | 0.50 | 0.69 | 0.21 | 0.89 |
| 90-day mortality | | | | | | | |
| MELD | 0.86 | 0.062 | 0.73-0.98 | 0.50 | 0.88 | 0.50 | 0.88 |
| DF | 0.84 | 0.054 | 0.73-0.94 | 1.00 | 0.51 | 0.33 | 1.00 |
| GAHS | 0.79 | 0.061 | 0.67-0.91 | 0.64 | 0.74 | 0.37 | 0.89 |
| ABIC | 0.75 | 0.077 | 0.58-0.88 | 0.79/0.29 | 0.49/0.95 | 0.28/0.57 | 0.90/0.57 |
| MELD-Na | 0.83 | 0.068 | 0.69-0.96 | 0.36 | 0.91 | 0.50 | 0.85 |
| UKELD | 0.77 | 0.084 | 0.60-0.93 | 0.57 | 0.72 | 0.33 | 0.87 |

AUROC, area under the receiver operating characteristics curve; Std. err, standard error; CI, confidence interval; PPV, positive
predictive value; NPV, negative predictive value; MELD, model for end-stage liver disease; DF, Maddrey discriminant function; GAHS,
Glasgow alcoholic hepatitis score; ABIC, age, bilirubin, international normalised ratio and creatinine score; MELD-Na, modified
MELD including sodium; UKELD, United Kingdom model for end-stage liver disease.

* Cut-off values used: MELD: 21, DF: 32, GAHS: 9, ABIC: 6.71/9, MELD-Na: 28, UKELD: 56.



Models proposed to assess corticosteroid 
responsiveness 

Calculation of the ECBL, a fall in serum bilirubin by 25% and the Lille
score was possible for 31 of 34 patients with severe AH (admission DF >32)
receiving corticosteroids; two patients died before day 7 of
corticosteroids, whereas an additional patient did not have all the
required biochemical data. Comparisons in biochemical and clinical
parameters between corticosteroid-treated patients who survived and those
who died (six patients by day 30 and eight patients by day 90) are shown
in Table 4. After 1 week of corticosteroids, 14 (45.2%) patients had a 25%
fall in bilirubin from baseline, 22 (71%) had an ECBL, and 22 (71%)
achieved a Lille response using the proposed cut-point of 0.45. The AUROC
analysis and operational characteristics of the three models with respect
to 30- and 90-day mortality are shown in Figure 2. Overall, no
statistically significant differences were found in AUROC comparisons
between the different models relative to either a 30- or 90-day outcome.
Notably, a good response to corticosteroids according to all three
criteria yielded an excellent NPV for excluding short-term mortality
(>85%), whereas the PPVs were substantially lower, in all cases <60%.
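
The three response criteria can be expressed as simple rules on the day 0 and day 7 bilirubin and the Lille score. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, a "25% fall" is assumed to mean a fall of at least 25% from baseline, and a Lille score below the 0.45 cut-point is assumed to denote response, consistent with the higher Lille values seen in nonsurvivors in Table 4.

```python
def steroid_response(bili_day0, bili_day7, lille_score, lille_cutoff=0.45):
    # Fall in bilirubin over the first week of treatment (same unit for both).
    fall = bili_day0 - bili_day7
    return {
        "ecbl": fall > 0,                             # any fall at day 7
        "fall_25_percent": fall >= 0.25 * bili_day0,  # assumed: at least 25%
        "lille_responder": lille_score < lille_cutoff,
    }

# Illustrative inputs taken from the Table 4 medians (not individual patients).
print(steroid_response(328, 209, 0.14))
print(steroid_response(397, 368, 0.75))
```

The second call shows how the criteria can disagree: a small fall in bilirubin counts as an ECBL but meets neither the 25% rule nor the Lille criterion.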

Table 4 | Clinical and biochemical parameters used for the calculation of the Lille score in patients with severe
alcoholic hepatitis (admission Maddrey >32) treated with corticosteroids (n = 31). Comparisons regard patients who
survived at 30- and 90-days and those who died. Quantitative variables are given as medians (range)

| Variable | 30-day survivors | 30-day nonsurvivors | P value | 90-day survivors | 90-day nonsurvivors | P value |
|---|---|---|---|---|---|---|
| Age (years) | 49 (26-59) | 50 (30-54) | 0.88 | 49 (26-59) | 49 (26-59) | 0.96 |
| Male gender n (%) | 21 (84) | 3 (50) | 0.11 | 19 (82.6) | 5 (62.5) | 0.33 |
| Bilirubin-day 0 (µmol/L) | 328 (87-786) | 397 (176-647) | 0.63 | 318 (87-768) | 456.5 (176-647) | 0.31 |
| Bilirubin-day 7 (µmol/L) | 209 (45-768) | 368 (175-851) | 0.09 | 202 (45-768) | 427.5 (175-851) | 0.03 |
| Δ Bilirubin (µmol/L) | 64 (-117 to 286) | -31.5 (-204 to 87) | 0.03 | 64 (-117 to 286) | -31.5 (-204 to 129) | 0.03 |
| Albumin (g/L) | 30 (18-41) | 28.5 (20-36) | 0.25 | 30 (18-41) | 28.5 (20-36) | 0.25 |
| Creatinine (µmol/L) | 53 (29-209) | 103.5 (26-267) | 0.05 | 52 (29-209) | 90 (26-267) | 0.03 |
| Prothrombin time (s) | 23 (18.7-35.6) | 26 (18.3-50.5) | 0.09 | 23.8 (18.7-35.6) | 23.5 (18.3-50.5) | 0.23 |
| INR | 2 (1.4-3.1) | 2.25 (1.4-4.7) | 0.12 | 2 (1.4-3.1) | 2.15 (1.4-4.7) | 0.16 |
| Lille score | 0.14 (0.01-0.91) | 0.75 (0.18-0.99) | 0.02 | 0.09 (0.01-0.91) | 0.75 (0.14-0.99) | 0.009 |

Figure 2 | AUROC analysis and operational characteristics for three different indicators of response to corticosteroid
treatment, used to predict 30-day (a) and 90-day (b) mortality. (x-axis: 1 - Specificity; diagonal segments are produced by ties.)

| Score | AUROC | Std. err | 95% CI | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|---|
| 30-day mortality | | | | | | | |
| ECBL | 0.73 | 0.123 | 0.49-0.98 | 0.67 | 0.80 | 0.44 | 0.91 |
| 25% fall in bilirubin from baseline | 0.78 | 0.084 | 0.62-0.94 | 1.00 | 0.56 | 0.35 | 1.00 |
| Lille score* | 0.81 | 0.019 | 0.66-0.97 | 0.67 | 0.80 | 0.44 | 0.91 |
| 90-day mortality | | | | | | | |
| ECBL | 0.73 | 0.113 | 0.50-0.95 | 0.63 | 0.83 | 0.56 | 0.86 |
| 25% fall in bilirubin from baseline | 0.72 | 0.099 | 0.53-0.91 | 0.88 | 0.57 | 0.41 | 0.93 |
| Lille score* | 0.82 | 0.079 | 0.66-0.97 | 0.63 | 0.83 | 0.56 | 0.86 |

AUROC, area under the receiver operating curve; ECBL, early change in bilirubin levels; CI, confidence interval;
PPV, positive predictive value; NPV, negative predictive value.
* Using a Lille score cut-point of 0.45.

DISCUSSION

The present study is an external evaluation of nine prognostic models of
AH using a 100% biopsy-proven cohort, including the first disease-specific
assessment of the UKELD used for prioritizing liver recipients in the UK.
Overall, MELD, DF, GAHS and the ABIC proved to be 
clinically useful scores, performing comparably and with 
an acceptable accuracy (AUROCs exceeding 0.70) for 
both 30- and 90-day mortality. Our findings are congru- 
ent with those of previous validation studies in which, 
however, diagnosis of AH was based solely on clinical 
grounds. In a Danish study including 274 patients, 
MELD, MELD-Na, GAHS, Lille score and the ABIC also 
performed comparably in predicting 28-, 84- and 
180-day mortality.^" Similarly, in another study, MELD, 
DF, ABIC and GAHS performed equally in predicting 
short-term (30- and 90-day) survival, although all mod- 
els were uniformly poor in predicting longer-term 
(6-month and 1-year) outcome.^'' In a prospective com- 
parison of 182 patients, DF, GAHS, MELD and ABIC 
performed well with no statistically significant difference 
for either 28- or 90-day mortality after admission.^^ We
observed a tendency towards better prognostic accuracy
for 90-day than for 30-day mortality (90-day AUROCs:
0.68-0.86 vs. 30-day AUROCs: 0.56-0.79), and when
assessment of prognosis was repeated 1 week after
admission (90-day AUROCs: 0.75-0.86; 30-day AUROCs:
0.69-0.85). This is consistent with previous observations
on the utility of repeated scoring 6-9 days after
hospital admission.^^'
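
For readers who wish to reproduce this kind of comparison, the AUROC can be read as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with tied scores counted as one half. The following is a minimal sketch of that calculation; the function name and the example scores are hypothetical and are not data from this cohort.

```python
def auroc(scores_died, scores_survived):
    """AUROC as P(score of a non-survivor > score of a survivor).

    Tied pairs contribute 0.5, which is the usual convention when
    several patients share the same score.
    """
    if not scores_died or not scores_survived:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for d in scores_died:
        for s in scores_survived:
            if d > s:
                wins += 1.0
            elif d == s:
                wins += 0.5
    return wins / (len(scores_died) * len(scores_survived))


# Hypothetical admission scores, invented for illustration only.
died = [28, 24, 21]
survived = [21, 18, 15, 12]
example_auroc = auroc(died, survived)  # 11.5 concordant pairs out of 12
```

The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of this scale; a rank-based implementation would be preferable for large datasets.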

DF was developed several decades ago in patient
cohorts whose supportive care may have differed from
that given to current patients. Our study, in agreement
with previous observations, indicates an inadequate
specificity (<40%) of DF for mortality: 39/49 (79.6%) of
patients with DF >32 were alive at day 30 and 35/49
(71.4%) at day 90. Obsolescence of the cut-point of 32
may account, at least partially, for the inaccuracy of DF,
and higher cut-offs have been proposed: 37 in the study
by Dunn et al.^^ and 42 in the study by Sheth et al.^^
The optimal cut-off within our cohort was 44,
although even with this value the specificity of DF
would still be less than 60%. This inaccuracy of DF has
been suggested as the basis of the long-standing debate 
on the efficacy of corticosteroid treatment. Moreover, 
poor standardisation of PT across different laboratories 
represents another limitation affecting the reproducibility 
of this index.^*^ 
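
The 30-day specificity of DF can be checked from the counts quoted above. The sketch below assumes that 10 of the 71 patients had died by day 30 (14.1% of 71) and that every death occurred among patients with DF >32 (consistent with the reported 30-day sensitivity of 1.00). The function name and the derived cell counts are illustrative and are not taken from the paper's tables.

```python
def two_by_two(total, deaths, flagged, flagged_alive):
    """Derive test characteristics from cohort counts.

    total         -- patients in the cohort
    deaths        -- patients who died by the time point
    flagged       -- patients above the cut-point (here DF > 32)
    flagged_alive -- flagged patients who were alive at the time point
    """
    tp = flagged - flagged_alive   # flagged and died
    fp = flagged_alive             # flagged but survived
    fn = deaths - tp               # died without being flagged
    tn = total - deaths - fp       # survived and not flagged
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("inconsistent counts")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed 30-day counts: 71 patients, 10 deaths, 49 with DF > 32, 39 of them alive.
df_30 = two_by_two(total=71, deaths=10, flagged=49, flagged_alive=39)
```

Under these assumptions the specificity is 22/61, about 0.36, which agrees with the 30-day value reported for DF in this study.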

Previous studies^^"^'' have proposed use of MELD as 
an alternative model, more specific for mortality, as 
compared to DF. As well as INR, which has problems in
reproducibility of measurement,^'' MELD includes
creatinine, a relevant prognostic indicator in AH, but one
whose measurement in the context of hyperbilirubinemia
is also problematic.^^ Our data, consistent with these older
reports, show sensitivity/specificity of 0.80/0.79
(30-day) and 0.86/0.84 (90-day) for the MELD vs.
1.00/0.36 (30-day) and 1.00/0.39 (90-day) for the DF.
In a study by Srikureja et al., 1-week MELD was
shown to be more accurate than admission
MELD for the prediction of in-hospital mortality.^'' Our
results further validate this observation with
respect to the prediction of both 30-day (AUROC: 0.84
for 1-week MELD vs. 0.79 for admission MELD) and
90-day risk of death (AUROC: 0.86 for 1-week MELD
vs. 0.84 for admission MELD). However, in contrast to
this last study, we could not identify any advantage of
1-week re-testing of MELD over DF, as re-calculation of
the DF 1 week from admission also yielded excellent
prognostic accuracies (AUROC >0.80), comparable to
those obtained by recalculating MELD. Disease-specific
calibration of MELD is an issue: in our cohort the optimal
cut-off value was 21, similar to that in the study by Dunn
et al.^^ However, lower thresholds, such as 11 reported
by Sheth et al.^^ and 18 reported by Srikureja et al.,^^
have been proposed, probably reflecting differences in
disease severity between cohorts.
Neither MELD-Na nor UKELD, both modifications of
MELD incorporating sodium, was prognostically
superior to classical MELD within our cohort. Previously,
MELD-Na was shown to be a stronger predictor of 180-day
mortality (vs. MELD) when patients with clinically
diagnosed AH and ascites were considered. However, the
small sample size in that study (26 patients, 13 with
ascites) precludes definitive conclusions.^*

GAHS and ABIC are disease-specific formulas which
also include creatinine^'*' but are easier to calculate at
the bedside than MELD. The ABIC score
includes parameters similar to those of the MELD score,
with the addition of patient's age, whereas GAHS is the
only index to consider an inflammatory parameter (white cell count)
(Table 1). Although GAHS and ABIC have been shown
to perform significantly better than DF within their
internal validation cohorts, our results suggest comparable
predictive accuracies of these three models. In a
head-to-head comparison of GAHS and ABIC using 181
patients from the GAHS validation cohort, the two models
also performed equally.^* ABIC is a dual cut-off
model which generates a trichotomous classification into
low, intermediate and high risk of death.** Critically,
this ability was questioned in a recent study in which
the three stages did not result in differences in 90-day
outcomes between the 'low' and 'intermediate' groups,
although there was a clearly worse outcome in those
with ABIC >9.^^ Unfortunately, the relatively small number
of events precluded us from undertaking a similar
analysis.

The Lille score is a combination of six reproducible
variables including a dynamic one, i.e. the evolution in 
bilirubin following 1 week of corticosteroid treatment.^^ 
Within our biopsy-proven cohort, use of the Lille model 
proved an accurate predictor of both 30- and 90-day 
outcome (AUROC 0.81/0.82). However, our data do not 
justify its complexity over the much easier bedside calcu- 
lation of ECBL and 25% fall in bilirubin, both perform- 
ing with an acceptable grade of accuracy within our 
cohort (AUROC 0.72-0.78). Our findings are congruent 
with those of a recent prospective assessment, in which, 
however, diagnosis of AH relied solely on clinical
criteria.^^ As 7-day biochemical data are necessary
to calculate the Lille score, it is interesting
to note that re-calculation at 7 days of either the MELD
or DF also provided excellent predictive accuracies
(AUROCs >0.80). However, their dynamic evolution (i.e.
ΔMELD and ΔDF) has been reported to have less
prognostic power in comparison with changes in Lille score.
Thus, the Lille model represents the best currently vali- 
dated dynamic criterion for the assessment of mortality 
in AH, and the only one linked to specific stopping rules 
for corticosteroid management: in poor responders (Lille 
>0.45) discontinuation of corticosteroids is recom- 
mended,^^ particularly when Lille >0.56 (i.e. considered 
null responders). 
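
The bedside response criteria discussed above can be sketched in a few lines. This is an illustrative sketch only: it assumes that ECBL means a day-7 bilirubin lower than the day-0 value, it takes the Lille thresholds of 0.45 and 0.56 from the text, and the function names and the "responder" label for scores of 0.45 or below are hypothetical. The Lille score itself is taken as an input and is not computed here.

```python
def bilirubin_response(bili_day0, bili_day7):
    """Two simple response criteria from day-0 and day-7 bilirubin.

    Both values must be in the same unit; only their ratio matters.
    ECBL is ASSUMED here to mean any fall in bilirubin at day 7.
    """
    if bili_day0 <= 0 or bili_day7 < 0:
        raise ValueError("bilirubin values must be positive")
    fractional_fall = (bili_day0 - bili_day7) / bili_day0
    return {
        "ecbl": bili_day7 < bili_day0,
        "fall_25pct": fractional_fall >= 0.25,
        "fractional_fall": fractional_fall,
    }


def lille_category(lille):
    """Classify a precomputed Lille score using the cut-points in the text."""
    if not 0.0 <= lille <= 1.0:
        raise ValueError("Lille score must lie between 0 and 1")
    if lille > 0.56:
        return "null responder"
    if lille > 0.45:
        return "poor responder"
    return "responder"  # label chosen for illustration


# Hypothetical patient: bilirubin falls from 400 to 280 over the first week.
example = bilirubin_response(400, 280)
```

Keeping the two bilirubin criteria separate from the Lille classification mirrors the distinction drawn above between simple bedside measures and the full model.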

Importantly, all prognostic models, and in particular
DF, MELD, GAHS, ABIC and the Lille score, showed
excellent NPVs, in most cases exceeding 90%. This is in
contrast with the PPVs, which were low, in most instances
lower than 50% (Table 3). This finding suggests that the
paradigm in clinical decision making, and in the design
of clinical trials targeting specific treatments in AH,
should be to use these models to exclude low-risk
patients, rather than to identify those at high risk of death.
Clearly, some patients identified as high risk may
receive futile treatment, but evaluating different
thresholds, different weightings or new variables may
refine prognosis.
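
The combination of high NPV and low PPV follows largely from the low short-term mortality in this setting. As a rough illustration, the sketch below applies Bayes' rule to the 30-day sensitivity and specificity reported here for MELD (0.80/0.79) and the 30-day mortality of 14.1%. The function name is hypothetical, and the values obtained are derived estimates that may differ slightly from tabulated values because of rounding.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and event rate."""
    for name, p in (("sensitivity", sensitivity),
                    ("specificity", specificity),
                    ("prevalence", prevalence)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    tp = sensitivity * prevalence                 # true positives per patient
    fp = (1.0 - specificity) * (1.0 - prevalence) # false positives
    fn = (1.0 - sensitivity) * prevalence         # false negatives
    tn = specificity * (1.0 - prevalence)         # true negatives
    return tp / (tp + fp), tn / (tn + fn)


# MELD at admission, 30-day mortality, using the figures quoted in the text.
ppv, npv = predictive_values(0.80, 0.79, 0.141)
```

With these inputs the PPV is about 0.38 and the NPV about 0.96, so even a reasonably accurate score rules out early death far more reliably than it predicts it.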

The present study has limitations. We did not perform
a sample size calculation, and our study cohort was
based on the available patients, but it does reflect a
4-year single-centre experience of histologically
diagnosed AH. Therefore, although our sample size is
comparable to^'' or more than double^' that of previous
publications, our study may be underpowered to detect a
significant difference in the predictive performances
between models. This is more likely to be true in the
analysis of the Lille score and its variants, which was
restricted to 31 patients receiving corticosteroids.
Treatment with corticosteroids may have led to
underestimation of the predictive ability of general
prognostic scores,
although a survival benefit conferred by this treatment
remains in some dispute*' ^ and corticosteroid-treated
patients have been previously included for the develop- 
ment and/or validation of the models. Despite 
adjustment for confounding variables there was no 
demonstrable corticosteroid effect on survival, which is 
unsurprising, considering that our study was not 
designed nor powered to detect a therapeutic effect. 
However, this may indicate that treatment effects are not
a significant source of bias in predictive performance
in the present study. Transjugular liver biopsy^"' is
routinely performed in our institution whenever AH is
suspected. Although inclusion of less severe cases by this
institutional policy (and thus changes in predictive
accuracies of the models) could be possible, the admission
MELD in our series is comparable to that of other
cohorts in which diagnosis of AH relied solely on clinical
criteria.^° Congruently, the 30- and 90-day mortality
in our cohort was 14.1% and 17.9% respectively, which
is consistent with previous studies reporting short-term
mortality ranging from 14.4% to 27%.^^"^^'

In conclusion, the negative predictive values of MELD, 
DF, GAHS and the ABIC as well as those of three differ- 
ent scores to assess corticosteroid response, proved to be 
valid for prognostication when assessed in an indepen- 
dent cohort of patients with biopsy-proven AH. In our 
series, neither MELD-Na nor UKELD conferred any
prognostic advantage in comparison with classical
MELD. The choice of prognostic model thus depends on
other factors including ease of use, routine use of corti- 
costeroids according to institutional practice and the per- 
sonal preferences of the treating physician. However, 
there is still room for further refinement, and efforts for 
improved prognostic models should continue, as there is 
increasing need for accurate prognostic stratification in 
AH, particularly with the possibility of early liver
transplantation. Thus, if liver transplantation is considered,
it is currently important to rely on response criteria to
corticosteroids, or on non-improvement at 7 days, as the
PPV of all models is insufficient to establish a poor
prognosis at admission.